© Laurenz Albe 2024
Every year in fall, there is a new PostgreSQL release. After looking at the highlights of PostgreSQL v17, you may think, “what's the big deal?” Quite a few people might even be unhappy about the reminder that they should really upgrade some time soon. Time to explain how wonderful PostgreSQL v17 is!
Well, there are — I'm going to rant about them later. But there is certainly no eye-catcher like “automatic sharding for frictionless horizontal scaling” or “built-in automatic failover for high availability”. That's not because PostgreSQL has lost its momentum: in fact, there are more contributors today than ever before. There are several explanations for this seeming lack of innovation.
Over the decades, PostgreSQL has grown a lot. If I think back to release 8.1, the first release I worked with: autovacuum was still something new and somewhat experimental, replication was something scary you did with Slony, and so on. The average DBA had never heard of PostgreSQL. It is amazing to think of all the new features that have come since. How could we ever live without them?
Many smart people have contributed many great things over the years. Most of the easy, obvious improvements (and some difficult ones!) have already been made. The remaining missing features are the really hard ones.
Over the years, as the number of contributors and the world-wide importance of PostgreSQL have grown, so have the demands on new contributions. Today, each code contribution has to go through a peer review process. An ever-increasing number of patches vie for the interest of reviewers and committers. Spending time to improve and merge somebody else's work is much less attractive than working on your own cool feature. This narrow bottleneck means that you need a lot of time and determination if you want to get your contribution committed.
[...]In previous blogs, we gave an overview of PostgreSQL’s table access method API here, looked at how sequential scan is handled within this API here, and at how data insertion is handled here. Today, we will look closely at how PostgreSQL handles updates. A successful update in PostgreSQL can be viewed as “inserting a new record” while “marking the old record as invisible”, due to the MVCC technique that PostgreSQL employs. It sounds simple enough, but quite a lot of considerations are in place to make a successful update. Let’s dive in.
Two table access method APIs are primarily involved when performing an update
To perform an update, PostgreSQL runs through a series of checks and considerations before it can carry out the update. This process is illustrated in the diagram below:
The very first check the main update routine performs is to determine the columns to be updated, in particular to find out whether identity key columns have been updated. This could be a primary key, index key, or partition key. [...]
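The “insert a new record / mark the old one invisible” behavior described above can be observed from plain SQL by watching a row's physical location and transaction IDs; a minimal sketch (table and values are invented for illustration):

```sql
-- Hypothetical table to observe MVCC row versions
CREATE TABLE demo (id int PRIMARY KEY, val text);
INSERT INTO demo VALUES (1, 'old');

SELECT ctid, xmin, xmax, val FROM demo;  -- note the tuple's position, e.g. (0,1)

UPDATE demo SET val = 'new' WHERE id = 1;

-- The row now lives in a new tuple: different ctid, new xmin.
-- The old version still exists on disk (invisible) until VACUUM reclaims it.
SELECT ctid, xmin, xmax, val FROM demo;
```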
[...]
The source code is available on GitLab; a mirror is hosted on GitHub.
One could install the whole package, or just copy what is needed from the source code.
The extension is also available on PGXN.
Because pgTAP uses the same function name, is_empty, there was a conflict when pgTAP had been installed before pgsql-tweaks.
To solve this problem without breaking existing code, I decided to install the function under a different name, is_empty_b, when pgTAP is already installed.
When pgTAP is not installed, the function is still named is_empty, so existing code using the function keeps working.
There is also a difference between the two implementations: while the pgTAP function returns text, the pgsql-tweaks function returns a boolean result.
I have also created an issue with the pgTAP project, because I can only solve the case where pgTAP has been installed before pgsql-tweaks.
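A hedged usage sketch of the naming difference described above (the calls are illustrative; exact behavior depends on the installed versions):

```sql
-- With only pgsql-tweaks installed: the function keeps its original name
SELECT is_empty('');     -- pgsql-tweaks semantics, boolean result

-- With pgTAP installed first: pgsql-tweaks uses the alternate name
SELECT is_empty_b('');   -- same boolean behavior, conflict-free name
-- pgTAP's own is_empty() remains available and returns text (a TAP result line)
```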
There have been no new or changed features; therefore, this is only a minor release.
The source code is available on GitLab; a mirror is hosted on GitHub.
One could install the whole package, or just copy what is needed from the source code.
The extension is also available on PGXN.
In this release the recommended PostgreSQL version has been changed to 17.
PostgreSQL 11 has been removed from the list of supported versions.
There have been no new or changed features; therefore, this is only a minor release.
Upgrading your PostgreSQL cluster is an important task to keep your database running smoothly and securely. With each new release, PostgreSQL introduces performance improvements, security patches, and new features that can benefit your system. However, upgrading can be a bit tricky, especially if you're working in a Windows environment, where certain challenges like permissions, service management, and file handling may differ from Linux setups. In this blog, we’ll walk you through the process of performing an upgrade on a PostgreSQL cluster in Windows, covering the key steps to ensure everything goes smoothly without causing data loss.
The post Using pg_upgrade to Upgrade Your PostgreSQL Cluster on Windows appeared first on Stormatics.
In my last blog post, Evolution of Logical Replication, I mentioned the future development of a feature to allow "upgrades of logical replication nodes." The upcoming release of PostgreSQL 17 includes this feature. Previously, after major version upgrades (via pg_upgrade), users couldn't immediately connect and write data to logical replication nodes. This was because the slots were lost during upgrades, preventing replication from continuing. As a result, new writes wouldn't get replicated, causing data on both nodes to become out of sync. As explained in this blog post, users had to block applications from writing until the replication setup was re-enabled after the upgrade.
With PostgreSQL 17, logical replication nodes can be upgraded without blocking writes or requiring users to manually detach/attach subscriptions or create slots. The migration of logical slots is supported only when the old cluster is version 17.0 or later, so users will benefit from this feature when upgrading from 17.0 to 18.0 (or later versions). Refer to the PostgreSQL documentation for the prerequisites for upgrading publisher and subscriber nodes.
This blog will delve into the internals of this feature and provide an example of upgrading a two-node logical replica setup. The latter part of this blog will cover online upgrades of physical replication setups.
Let's start with an example of upgrading a logical replica setup where the old and new versions of databases are PostgreSQL 17.
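Before running pg_upgrade on the publisher, it is worth confirming that the logical slots are in a state that can be migrated; a sketch using the pg_replication_slots view (the checks mirror the documented prerequisites):

```sql
-- All logical slots should be usable and fully consumed before the upgrade;
-- a slot with wal_status = 'lost' or unconsumed WAL cannot be migrated safely.
SELECT slot_name, plugin, wal_status, confirmed_flush_lsn
FROM pg_replication_slots
WHERE slot_type = 'logical';
```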
The PostGIS Team is pleased to release PostGIS 3.5.0! Best Served with PostgreSQL 17 RC1 and GEOS 3.13.0.
This version requires PostgreSQL 12 - 17, GEOS 3.8 or higher, and Proj 6.1+. To take advantage of all features, GEOS 3.12+ is needed. SFCGAL 1.4+ is needed to enable postgis_sfcgal support. To take advantage of all SFCGAL features, SFCGAL 1.5 is needed.
Cheat Sheets:
This release is a feature release that includes bug fixes since PostGIS 3.4.3, new features, and a few breaking changes.
A couple of days ago I had a bit of free time in the evening, and I was bored, so I decided to play with BOLT a little bit. No, not the dog from the Disney movie, but the BOLT tool from the LLVM project, aimed at optimizing binaries. It took me a while to get it working, but the results are unexpectedly good, in some cases up to 40%. So let me share my notes and benchmark results; maybe there's something we can learn from it. We'll start by going through a couple of rabbit holes first, though.
As I mentioned in my last blog post, as your cluster grows with multiple standby servers and potentially automated failover (using tools like Patroni), it becomes more practical to set up a dedicated repository host, also known as a dedicated backup server. This backup server can then trigger backups and automatically select the appropriate node in case of failover, eliminating the need for manual intervention.
In this post, I’ll show you how easy it is to add a repository host to an existing cluster. I’ll also give you a sneak peek at a new feature expected to be included in the next pgBackRest release 😉
In this example, we pick up from where we left off last time: a primary server (pg1) with a standby (pg2), both already configured to use pgBackRest (with an NFS mount) for backups taken from the standby. Now, we will add a new node, repo1, to take over pgBackRest backups.
The pgBackRest user guide provides a comprehensive overview of how to set up a repository host. Since pgBackRest needs to interact with local processes on each node, we must enable communication between the hosts, either through passwordless SSH or TLS with client certificates. While SSH is generally easier to set up, TLS offers better performance. If you’re interested in an example of the TLS setup, I wrote this blog post when the feature was first introduced.
Let’s return to our repository host setup. The first step, of course, is to install pgBackRest:
$ sudo dnf install pgbackrest -y
Any user can own the repository, but it’s best to avoid using the postgres user (if it exists) to prevent confusion. Instead, let’s create a dedicated system user for this purpose:
$ sudo groupadd pgbackrest
$ sudo adduser -g pgbackrest -n pgbackrest
$ sudo chown -R pgbackrest: /var/log/pgbackrest/
The SSH setup is up to you, but usually it is as simple as creating SSH keys and authorizing them on the other nodes. Example:
# From repo1
[pgbackre
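The snippet above is cut off, but the usual pattern is to generate a key pair for the pgbackrest user on repo1 and authorize it on the database nodes; a sketch, assuming passwordless ed25519 keys (host and user names as in the example, paths are placeholders):

```shell
# Generate a passwordless key pair for the pgbackrest user (paths are examples)
keydir=$(mktemp -d)
ssh-keygen -t ed25519 -N "" -C "pgbackrest@repo1" -f "$keydir/id_ed25519" >/dev/null

# The public key would then be authorized on pg1 and pg2, e.g.:
#   ssh-copy-id -i "$keydir/id_ed25519.pub" postgres@pg1
#   ssh-copy-id -i "$keydir/id_ed25519.pub" postgres@pg2
ls "$keydir"
```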
As a senior database consultant and developer at CYBERTEC PostgreSQL International, I enjoyed speaking at PGDay Hyderabad 2024. This event marked two firsts for me: my first time visiting India and the first-ever PostgreSQL conference held in Hyderabad. I was filled with excitement and curiosity as I started this journey. I didn't know what to expect, but I was eager to experience it all.
I arrived in Hyderabad late at night, around 2:00 AM. While I was a little tired from the long flight, I was extremely excited. However, I made two rookie mistakes. First, I didn't buy a local SIM card, which left me without mobile internet for the whole trip — a particularly unpleasant experience in India, where OTPs are required even for something as simple as Uber. Second, I didn't exchange money for local currency, leading to small logistical challenges. And the driving culture? Well, let's just say that seeing it live was a real surprise!
Being in India for the first time was a cultural shock — but in a good way! Everything here is different from what you're used to in Europe. The tempo of life is wild, and the cuisine is an adventure of its own. Yes, it's spicy, but I handled it! If you ever have the opportunity to visit India, please do so. It's an unforgettable experience.
Despite the rain pouring down the entire day, I was glad I took a full-day tour of Hyderabad. I visited Golconda Fort, The Charminar, Laad Bazaar, and Mecca Masjid Mosque. One of the highlights was stopping at the famous Nimrah Cafe, where the owner welcomed me and gave me a kitchen tour. I sampled an assortment of delicious pastries and tasted their renowned tea. Spending time there, soaking in the atmosphere, was a beautiful experience. I highly recommend it!
In general, I had no particular expectations for my first PostgreSQL conference in India. I just wanted to live in the moment and take everything in as it unfolded.
[...]With PostgreSQL 17 RC1 released, we are on the home run towards the official PostgreSQL release, scheduled for September 26, 2024.
Letʼs take a look at the patches that came in during the March CommitFest. Previous articles about PostgreSQL 17 CommitFests: 2023-07, 2023-09, 2023-11, 2024-01.
Together, these give an idea of what the new PostgreSQL will look like.
Unfortunately, some previously accepted patches didn't make it in after all. Some of the notable ones:
Now, letʼs get to the new stuff.
With RC1 freshly cut, the release of Postgres 17 is right on the horizon, giving us a host of features, improvements, and optimizations to look forward to.
As a backend developer, one in particular pops off the page, distinguishing itself amongst the dozens of new release items:
Allow btree indexes to more efficiently find a set of values, such as those supplied by IN clauses using constants (Peter Geoghegan, Matthias van de Meent)
The B-tree is Postgres' overwhelmingly most common and best optimized index, used for lookups on a table's primary key or secondary indexes, and undoubtedly powering all kinds of applications all over the world, many of which we interact with on a daily basis.
During lookups, a B-tree is scanned, with Postgres descending through its hierarchy from the root until it finds a target value on one of its leaf pages. Previously, multi-value lookups like id IN (1, 2, 3) or id = any(1, 2, 3) would require that process be repeated multiple times, once for each of the requested values. Although not perfectly efficient, it wasn't a huge problem because B-tree lookups are very fast. It'd take an extremely performance-sensitive user to even notice the deficiency.
As of a Postgres 17 enhancement to nbtree's ScalarArrayOp execution, that's no longer always the case. Any particular scan with multiple scalar inputs will consider all those inputs as it traverses the B-tree, and where multiple values land on the same leaf page, they're retrieved together to avoid repetitive traversals.
A narrowly focused script to demonstrate the original problem shows a dramatic performance increase before and after the ScalarArrayOp improvement, so we already know the gains are very real. With Postgres 17 so close to hand, we wanted to measure what kind of gain a realistic web app might expect from the optimization by testing it against the real API service that powers Crunchy Bridge.
In our experiment we saw roughly a 30% improvement in throughput and a 20% drop in average r
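A sketch of how one might observe the improvement locally (table and values are hypothetical; the interesting part is comparing buffer counts between Postgres 16 and 17):

```sql
CREATE TABLE items (id bigint PRIMARY KEY, payload text);
INSERT INTO items SELECT g, 'row ' || g FROM generate_series(1, 100000) g;

-- On v17, IN-list values landing on the same leaf page are fetched in one
-- descent; "Buffers: shared hit" in the plan should be lower than on v16.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM items WHERE id IN (1, 2, 3, 4, 5);
```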
[...]Floor Drees wrote an article about how the Postgres community migrated to Mastodon, and which tools can be used.
Elodie Jex (Instagram: elodie.s_art_) designed the three proposals for the PostgreSQL Europe Diversity Task Force, from which one was selected as the final logo.
The following people contributed to the translation of the press release for the upcoming PostgreSQL v17 version:
Recently, we’ve received many questions about how to take backups from a standby server using pgBackRest. In this post, I’d like to clarify one of the most frequently asked questions and address a common misconception for new users.
First of all, it’s important to understand that taking a backup exclusively from the standby server is not currently possible. When you trigger a backup from the standby, pgBackRest creates a standby backup that is identical to a backup performed on the primary. It does this by starting/stopping the backup on the primary, copying only files that are replicated from the standby, then copying the remaining few files from the primary.
For this setup to work, both the primary and standby servers must share a common backup repository. This can be any supported repository type.
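In pgBackRest configuration terms, this behavior is enabled with the backup-standby option; a sketch (stanza name and paths are examples, option names from the pgBackRest configuration reference):

```ini
[global]
repo1-path=/shared/pgbackrest   # common repository, e.g. the NFS mount
backup-standby=y                # copy replicated files from the standby

[mycluster]
pg1-path=/var/lib/pgsql/16/data # local data directory on each node
```

With backup-standby=y, pgBackRest still needs to reach the primary to start and stop the backup, as described above.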
Let’s take an example, using an NFS mount point.
Both the primary (pg1) and the standby (pg2) see the same content of the mentioned NFS mount:
[postgres@pg1 ~]$ ls /shared/
[postgres@pg1 ~]$ touch /shared/test_write_from
[postgres@pg1 ~]$ mkdir /shared/pgbackrest
[postgres@pg2 ~]$ ls /shared
pgbackrest test_write_from
And we’ve got a working replication connection:
postgres=# SELECT * FROM pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid | 27773
usename | replicator
application_name | pg2
state | streaming
sent_lsn | 0/500AD6C8
write_lsn | 0/500AD6C8
flush_lsn | 0/500AD6C8
replay_lsn | 0/500AD6C8
sync_state | async
Let’s configure pgBackRest on pg1 and pg2:
$ pgbackrest version
pgBackRest 2.53.1
$ cat<<EOF | sudo tee "/etc/pgbackrest.conf"
[global]
repo1-path=/shared/pgbackrest
repo1-retention-full=4
repo1-bundle=y
repo1-block=y
start-fast=y
log-level-console=info
log-level-file=detail
delta=y
process-max=2
compress-type=zst
[mycluster]
pg1-path=/var/lib/pgsql/16/dat
Welcome to Part 7 of our series exploring the exciting new features anticipated in the official PostgreSQL 17 release. In this series, we delve into newly added features and discuss how they can benefit database developers and migration engineers transitioning to PostgreSQL 17 in the future.
In case you missed the earlier parts:
Ready to enhance your PostgreSQL development skills? My course on PostgreSQL Migration and PL/pgSQL will help you master database development and migration. Click here to start your journey, and use code DBGRP30 to get 30% off!
Databases are designed to handle critical and sensitive information, and it’s essential to enforce minimal grants to control who can access certain data and what operations can be performed. While necessary grants on data access and processing are common, having permissions for maintenance activities is equally crucial to ensure business SLA adherence and performance optimization.
With PostgreSQL 17, a new predefined role, pg_maintain, has been introduced to manage privileges for maintenance operations such as VACUUM, ANALYZE, REINDEX, and more on database tables, indexes, or views.
This means that a database developer is not only responsible for building business-critical functionality but can also maintain the respective database objects, like tables or indexes, ensuring they meet performance and business SLAs, all governed by grants that can be precisely controlled.
The following maintenance activities are supported with the new pg_maintain role, allowing for granular maintenance permissions at the table, index, or view level:
VACUUM: Cleans up dead tuples and recovers storage.[...]
ANALYZ
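A sketch of the resulting grants (role and table names are made up for illustration):

```sql
-- Cluster-wide: let a deployment role run VACUUM/ANALYZE/REINDEX and friends
GRANT pg_maintain TO deploy_user;

-- Or, more granular: the new MAINTAIN privilege on a single table
GRANT MAINTAIN ON TABLE orders TO deploy_user;
```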
As Joe said a few days ago: so many Postgres providers. Aiven, AWS, Azure, Crunchy, DigitalOcean, EDB, GCP, Heroku, Neon, Nile, Oracle, Supabase, Tembo, Timescale, Xata, Yugabyte… I’m sure there’s more I missed. And that’s not even the providers using Postgres underneath services they offer with a different focus than Postgres compatibility. (I noticed Qian Li’s upcoming PGConf NYC talk in 2 weeks… I have questions about DBOS!)
Kubernetes. I have a theory that more people are using kubernetes to run Postgres than we realize – even people on that list above. Neon’s architecture docs describe their sprinkling of k8s stardust (but not quite vanilla k8s; Neon did a little extra engineering here). There are hints around the internet suggesting some others on that list also found out about kubernetes.
And of course there are the Postgres operators. Crunchy and Zalando were first out of the gate in 2017. But not far behind, we had ongres and percona and kubegres and cloudnativepg.
We are database people. We are not actually a priesthood (the act is only for fun), but we are different. We are not like application people who can spin up a compute container anywhere and everywhere without a care in the world. We have state. We are the arch-enemies of the storage people. When the ceph team says they have finished their fio performance testing, we laugh and kick off the database benchmark suite and watch them panic as their storage crumbles under the immense beating of our IOPS and their caches utterly fail to predict our read/write patterns.
But we all have at least one thing in common: none of us really want to pay for a bunch of servers to sit around and do nothing, unless it’s really necessary. Since the dawn of time: from mainframes to PowerVM to VMware and now to kubernetes. We’re hooked on consolidating better and saving more money and kubernetes is the best drug yet.
In kubernetes, you manage consolidation with two things: requests and limits.
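For a Postgres pod, that translates into the standard resources stanza on the container spec (values are illustrative, not a recommendation):

```yaml
# Container spec fragment for a Postgres pod
resources:
  requests:           # what the scheduler reserves on a node
    cpu: "2"
    memory: 4Gi
  limits:             # hard caps enforced at runtime
    memory: 4Gi       # matching the request avoids surprise OOM kills
```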
Similarly to PostgreSQL, SQLite3 also needs some care…
Today I discovered, by accident I must confess, that PostgreSQL is not the only database requiring VACUUM: SQLite3 does too.
And there’s more: SQLite3 includes an auto-vacuum as well! It behaves similarly, at least in theory, to its PostgreSQL counterpart, but there is no autovacuum daemon or process. Moreover, the configuration is simpler, and I’ve not found any thresholds like the ones we have in PostgreSQL. In the following, I explain how VACUUM works in SQLite3, at least at a glance.
SQLite3 does not have fully enterprise-level MVCC machinery as PostgreSQL has, but when tuples are updated or deleted, fragmentation and unreclaimed space mean the database file never shrinks. Similarly to PostgreSQL, the now-empty space (no longer occupied by old tuples) is kept for future use, so the database grows without ever shrinking, even after large data removal.
VACUUM is the solution that SQLite3, too, uses to reclaim space. It is a command available at the SQLite3 prompt to start manual space reclamation. It works by copying the database file content into another (temporary) file and restructuring it, so nothing really fancy and new here!
Then comes auto-vacuum, which is turned off by default. Auto-vacuum works in either a full mode or an incremental mode. The former is the more aggressive and happens after a COMMIT. The latter is less intrusive and “prepares” what the vacuum process has to do without performing it; it is only when incremental_vacuum is launched that the space is freed. Therefore, auto-vacuum in SQLite3 either executes at each COMMIT or is postponed until it is considered safe to execute.
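The SQLite3 commands involved, for reference (pragma names from the SQLite documentation; the page count is an example):

```sql
-- SQLite3, not PostgreSQL:
PRAGMA auto_vacuum;                -- 0 = NONE (default), 1 = FULL, 2 = INCREMENTAL
PRAGMA auto_vacuum = INCREMENTAL;  -- takes effect on a new database or after VACUUM
VACUUM;                            -- manual: rewrites the file, reclaiming space
PRAGMA incremental_vacuum(100);    -- incremental mode: free up to 100 pages now
```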
Solid Cache is a relatively new caching framework that’s available now as a Ruby gem. In the next major version of Ruby on Rails, version 8, it’s becoming the default cache backend.
Solid Cache has a noteworthy difference from alternatives in that it stores cache entries in a relational database and not a memory-based data store like Redis.
In this post, we’ll set up Solid Cache, explore the schema, operations, and discuss some Postgres optimizations to consider.
Before we do that, let’s discuss caching in relational vs. non-relational stores.
A big change in the last decade is that there are now fast SSD disk drives that have huge capacities at low price points.
SSDs attached locally to an instance, not over the network, offer very fast read and write access. This configuration is available whether self-hosting or using cloud providers.
Besides the hardware gains, PostgreSQL itself has improved its efficiency across many releases in the last decade. Features like index deduplication cut down on index sizes, offering faster writes and reads.
Developers can optimize reads for their application by leveraging things like database indexes, materialized views, and denormalization. These tactics all consume more space and can add latency to write operations, but can greatly improve read access speeds. With Solid Cache, we’ll primarily look at a single solid_cache_entries table and how it’s indexed. The indexes themselves will contain all cache entry data, and when they’re small enough relative to available system memory, they can fit entirely into the fast in-memory buffer cache.
With faster hardware, abundant storage capacities, and optimized indexes, keeping cache data in the relational database is starting to make more sense. Being able to reduce dependencies on multiple data stores can simplify operations and reduce costs.
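A sketch of how one might sanity-check that fit on the Postgres side (assuming the solid_cache_entries table from Solid Cache's schema):

```sql
-- Compare the cache table's footprint against the configured buffer cache
SELECT pg_size_pretty(pg_table_size('solid_cache_entries'))   AS table_size,
       pg_size_pretty(pg_indexes_size('solid_cache_entries')) AS index_size;

SHOW shared_buffers;  -- indexes smaller than this can stay fully resident
```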
Now that we’ve covered a bit about why to consider a relational store for cache, let’s flip it around. Why would we not want
[...]Welcome CNPG Playground, a local learning environment for exploring CloudNativePG and PostgreSQL in Kubernetes. Using Docker and Kind, it simulates real-world scenarios, enabling developers and DBAs to experiment with PostgreSQL replication across two clusters. Designed for hands-on learning, CNPG Playground provides an accessible entry point for testing configurations and features, with plans for future enhancements and community collaboration.
As businesses increasingly move toward open-source technologies, many Oracle Database professionals find themselves needing to work with PostgreSQL, one of the most popular open-source relational database management systems (RDBMS). Although both Oracle and PostgreSQL share many similar concepts, there are fundamental differences in how these systems handle certain database structures, one of which is the schema.
The post Transitioning from Oracle to PostgreSQL: Understanding the Concept of Schema appeared first on Stormatics.
The SQL standard defines a set of rules so that database systems can be interchangeable, but there are small singularities in the wild. In this regard, the hierarchyid data type provided by SQL Server is a striking example. If you are switching to PostgreSQL, two solutions are available to you.
A first, simpler solution consists in linking each node to its parent using a new parentid column and applying a foreign key constraint. Another, more complete approach consists in using the ltree extension. This article deals with the latter case.
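A minimal ltree sketch (table and labels invented for illustration):

```sql
CREATE EXTENSION IF NOT EXISTS ltree;

CREATE TABLE org (path ltree PRIMARY KEY, name text);
INSERT INTO org VALUES
  ('top',            'Root'),
  ('top.sales',      'Sales'),
  ('top.sales.emea', 'EMEA');

-- All descendants of top.sales (<@ is the "is descendant of" operator):
SELECT name FROM org WHERE path <@ 'top.sales';

CREATE INDEX ON org USING gist (path);  -- accelerates <@ / @> lookups
```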
When we introduced columnar compression for Postgres in 2019, our goal was to help developers scale Postgres and efficiently manage growing datasets, such as IoT sensors, financial ticks, product metrics, and even vector data. Compression quickly became a game-changer, saving users significant storage costs and boosting query performance—all while keeping their data in Postgres. With many seeing over 95% compression rates, the impact was immediate.
But we didn’t stop there. Recognizing that many real-time analytics workloads demand flexibility for updating and backfilling data, we slowly but surely enhanced our compression engine to support INSERT, UPDATE, and DELETE (DML) operations directly on compressed data. This allowed users to work with compressed data almost as easily as they do with uncompressed data.
However, it also created a problem. While we had originally intended mutating compressed chunks to be a rare event, people were now pushing its limits with frequent inserts, updates, and deletes. Seeing our customers go all in on this feature confirmed that we were on the right track, but we had to double down on performance.
Today, we’re proud to announce significant improvements as of TimescaleDB 2.16.0, delivering up to 500x faster updates and deletes and 10x faster upserts on compressed data. These optimizations make compressed data behave even more like uncompressed data—without sacrificing performance or flexibility.
Let’s dive into how we achieved these performance gains and what they mean for you. To check this week’s previous launches and keep track of upcoming ones, head to this blog post or our launch page.
To understand our latest improvements, it helps to revisit how we initially threw away the rule book and enabled DML operations on compressed data.
Working with compressed data is tricky. Imagine trying to update a zipped file. You’d need to unzip the file, make your changes, and then zip it back up. Similarly, u
[...]Number of posts in the past two months
Get in touch with the Planet PostgreSQL administrators at planet at postgresql.org.